Document Indexing With a Concept Hierarchy

Authors

  • Alexander F. Gelbukh
  • Grigori Sidorov
  • Adolfo Guzmán-Arenas
Abstract

We discuss the task of selecting the concepts that describe the contents of a given document. We propose to use a large hierarchical concept dictionary (thesaurus) for this task, and present a statistical method of document indexing driven by such a dictionary. The problem of handling non-terminal nodes in the hierarchy is discussed. Common-sense-compliant methods of automatically assigning weights to the nodes and links in the hierarchy are presented. The application of the method in the Classifier system is discussed.
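As a rough illustration of dictionary-driven indexing with non-terminal nodes, the sketch below counts keyword hits at terminal concepts and sums them upward through a toy hierarchy. It is not the authors' method: the abstract does not specify the scoring or the node and link weights, so the hierarchy, the keyword lists, the function name `concept_scores`, and the plain unweighted summation are all assumptions made for the example.

```python
from collections import Counter

# Hypothetical toy hierarchy: each node maps to its list of child concepts.
HIERARCHY = {
    "science": ["physics", "biology"],
    "physics": [],
    "biology": [],
}

# Hypothetical keyword lists attached to terminal concepts.
KEYWORDS = {
    "physics": {"quantum", "particle", "energy"},
    "biology": {"cell", "gene", "species"},
}


def concept_scores(text, root="science"):
    """Return a score per concept: own keyword hits plus all descendants' hits.

    Assumption: every link has weight 1, so a non-terminal node simply
    accumulates the counts of the subtree below it.
    """
    counts = Counter(text.lower().split())
    scores = {}

    def visit(node):
        own = sum(counts[word] for word in KEYWORDS.get(node, ()))
        total = own + sum(visit(child) for child in HIERARCHY.get(node, []))
        scores[node] = total
        return total

    visit(root)
    return scores


if __name__ == "__main__":
    doc = "the gene and the cell interact with quantum energy gene"
    for concept, score in sorted(concept_scores(doc).items()):
        print(concept, score)
```

In this sketch a non-terminal node such as `science` always scores at least as high as any of its children, which hints at why the handling of non-terminal nodes and the choice of weights need separate treatment.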

Related resources

Conceptual document indexing using a large scale semantic dictionary providing a concept hierarchy

Automatic indexing is one of the important technologies used for Textual Data Analysis applications. Standard document indexing techniques usually identify the most relevant keywords in the documents. This paper presents an alternative approach that aims at performing document indexing by associating concepts with the document to index instead of extracting keywords out of it. The concepts are ...

Document Indexing with a Concept Hierarchy (Índice de Documentos con una Jerarquía de Conceptos)

Given a large hierarchical concept dictionary (thesaurus, or ontology), the task of selecting the concepts that describe the contents of a given document is considered. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semi-automatic translation of the hierarchy into different languag...

Indexing with a Concept Hierarchy

Given a large hierarchical concept dictionary (thesaurus, or ontology), the task of selecting the concepts that describe the contents of a given document is considered. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semiautomatic translation of the hierarchy into different language...

Document Retrieval through Concept Hierarchy Formulation

The enormous growth of the Internet and the widespread use of computer systems in general have created very large collections of electronic documents, and existing methods have proved unable to handle the massive amount of unstructured documents. In this article we discuss a variant of document retrieval, where traditional indexing is augmented by concept hierarchy (composed by observing conc...

IDSIS: Intelligent Document Semantic Indexing System

Zhongzhi Shi, Bin Wu, Qing He, Xiujun Gong, Shaohui Liu, Yi Zheng. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences. Abstract: With the rapid growth of the Internet, how to get information from this huge information space becomes an ever more important problem. In this paper, an Intelligent Document Semantic Indexi...

Publication date: 2001